Model Attributes & Explainability

Explaining the result of the modeling is an important step in making use of our analysis. By combining PCA and k-means, and the information contained in the model attributes within a SageMaker trained model, you can learn about a population and remark on some patterns you've found, based on the data.

To access the k-means model attributes, you'll find the following guidance in the main, exercise notebook.

EXERCISE: Access the k-means model attributes

Extract the k-means model attributes from where they are saved as a TAR file in an S3 bucket.

You'll need to access the model by the k-means training job name, and then unzip the file into model_algo-1. Then you can load that file using MXNet, as before.

# download and unzip the kmeans model file
# use the name model_algo-1

# get the trained kmeans params using mxnet
kmeans_model_params = None

print(kmeans_model_params)

Save the model attributes as kmeans_model_params ; you should see that there is only 1 set of model parameters contained within the k-means model: the cluster centroid locations in PCA-transformed, component space.

Cluster Centroids

You know that each of the counties in our US county data, is assigned to a cluster and that indicates something about groupings found by k-means and similarities between counties in the same cluster. But, what exactly are the features that these clusters have in common?

For example, how is cluster 1 any different than cluster 2?

This is what we aim to define by looking at the location of cluster centroids in component space. Since each cluster is defined in component space, we can look at how eighty each component is in defining a certain cluster. Then, we can go one step further and map the components back to the original data features. I encourage you to look at the visualization code in the exercise notebook and watch the next video to see some complete code.

Next Concept